11 results for Repetitive DNA sequences

at Duke University


Relevance:

90.00%

Publisher:

Abstract:

We have used analytical ultracentrifugation to characterize the binding of the methionine repressor protein, MetJ, to synthetic oligonucleotides containing zero to five specific recognition sites, called metboxes. For all lengths of DNA studied, MetJ binds more tightly to repeats of the consensus sequence than to naturally occurring metboxes, which exhibit a variable number of deviations from the consensus. Strong cooperative binding occurs only in the presence of two or more tandem metboxes, which facilitate protein-protein contacts between adjacent MetJ dimers, but weak affinity is detected even with DNA containing zero or one metbox. The affinity of MetJ for all of the DNA sequences studied is enhanced by the addition of SAM, the known cofactor for MetJ in the cell. This effect extends to oligos containing zero or one metbox, both of which bind two MetJ dimers. In the presence of a large excess concentration of metbox DNA, the effect of cooperativity is to favor populations of DNA oligos bound by two or more MetJ dimers rather than a stochastic redistribution of the repressor onto all available metboxes. These results illustrate the dynamic range of binding affinity and repressor assembly that MetJ can exhibit with DNA and the effect of the corepressor SAM on binding to both specific and nonspecific DNA.

Relevance:

80.00%

Publisher:

Abstract:

We consider the problem of variable selection in regression modeling in high-dimensional spaces where there is known structure among the covariates. This is an unconventional variable selection problem for two reasons: (1) the dimension of the covariate space is comparable to, and often much larger than, the number of subjects in the study, and (2) the covariate space is highly structured, and in some cases it is desirable to incorporate this structural information into the model building process. We approach this problem through the Bayesian variable selection framework, where we assume that the covariates lie on an undirected graph and formulate an Ising prior on the model space for incorporating structural information. Certain computational and statistical problems arise that are unique to such high-dimensional, structured settings, the most interesting being the phenomenon of phase transitions. We propose theoretical and computational schemes to mitigate these problems. We illustrate our methods on two different graph structures: the linear chain and the regular graph of degree k. Finally, we use our methods to study a specific application in genomics: the modeling of transcription factor binding sites in DNA sequences. © 2010 American Statistical Association.
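
As a rough illustration of what an Ising prior on the model space can look like, the sketch below scores a vector of inclusion indicators on a linear chain. The abstract does not give the exact form of the prior, so the parameterization here (a sparsity weight `a` on each included covariate plus an interaction weight `b` on each pair of neighboring covariates that are both included) is an assumption, as are all function and parameter names.

```python
from itertools import product
from math import exp


def chain_edges(p):
    """Edges of a linear chain graph on covariates 0..p-1."""
    return [(i, i + 1) for i in range(p - 1)]


def ising_log_prior(gamma, edges, a, b):
    """Unnormalized log prior of an inclusion vector gamma (entries 0 or 1).

    Assumed form: a * (number of included covariates)
                + b * (number of edges whose two endpoints are both included).
    """
    sparsity = a * sum(gamma)
    smoothness = b * sum(gamma[i] * gamma[j] for i, j in edges)
    return sparsity + smoothness


def normalized_prior(p, edges, a, b):
    """Exact prior over all 2**p models; feasible only for small p."""
    states = list(product((0, 1), repeat=p))
    weights = [exp(ising_log_prior(g, edges, a, b)) for g in states]
    z = sum(weights)
    return {g: w / z for g, w in zip(states, weights)}


if __name__ == "__main__":
    prior = normalized_prior(3, chain_edges(3), a=-2.0, b=1.0)
    # With b > 0, including two neighbors is favored over two non-neighbors.
    print(prior[(1, 1, 0)] > prior[(1, 0, 1)])
```

With a negative `a` the prior favors sparse models, and with a positive `b` it favors selecting covariates that are adjacent on the graph. Enumerating all models, as `normalized_prior` does, is only for illustration and does not scale to the high-dimensional settings the abstract describes.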

Relevance:

80.00%

Publisher:

Abstract:

The extinction of the giant tortoises of the Seychelles Archipelago has long been suspected but is not beyond doubt. A recent morphological study of the giant tortoises of the western Indian Ocean concluded that specimens of two native Seychelles species survive in captivity today alongside giant tortoises of Aldabra, which are numerous in zoos as well as in the wild. This claim has been controversial because some of the morphological characters used to identify these species, several measures of carapace morphology, are reputed to be quite sensitive to captive conditions. Nonetheless, the potential survival of giant tortoise species previously thought extinct presents an exciting scenario for conservation. We used mitochondrial DNA sequences and nuclear microsatellites to examine the validity of the rediscovered species of Seychelles giant tortoises. Our results indicate that the morphotypes suspected to represent Seychelles species do not show levels of variation and genetic structuring consistent with long periods of reproductive isolation. We found no variation in the mitochondrial control region among 55 individuals examined and no genetic structuring in eight microsatellite loci, pointing to the survival of just a single lineage of Indian Ocean tortoises.

Relevance:

80.00%

Publisher:

Abstract:

Cryptococcus neoformans is a pathogenic basidiomycetous yeast responsible for more than 600,000 deaths each year. It occurs as two serotypes (A and D) representing two varieties (i.e. grubii and neoformans, respectively). Here, we sequenced the genome and performed an RNA-Seq-based analysis of the C. neoformans var. grubii transcriptome structure. We determined the chromosomal locations, analyzed the sequence/structural features of the centromeres, and identified origins of replication. The genome was annotated based on automated and manual curation. More than 40,000 introns populating more than 99% of the expressed genes were identified. Although most of these introns are located in the coding DNA sequences (CDS), over 2,000 introns in the untranslated regions (UTRs) were also identified. Poly(A)-containing reads were employed to locate the polyadenylation sites of more than 80% of the genes. Examination of the sequences around these sites revealed a new poly(A)-site-associated motif (AUGHAH). In addition, 1,197 miscRNAs were identified. These miscRNAs can be spliced and/or polyadenylated, but do not appear to have obvious coding capacities. Finally, this genome sequence enabled a comparative analysis of strain H99 variants obtained after laboratory passage. The spectrum of mutations identified provides insights into the genetics underlying the micro-evolution of a laboratory strain, and identifies mutations involved in stress responses, mating efficiency, and virulence.

Relevance:

80.00%

Publisher:

Abstract:

Centromeres are chromosomal loci essential for genome stability. Their malfunction can cause chromosome instability associated with cancer, infertility, and birth defects. This study focused on an intriguing centromere on human chromosome 17, which displays normal functional variation. Centromere identity can be found on either of two large arrays of repetitive DNA. We investigated inter-individual sequence variation on these two arrays and found association between array size, array variation, and centromere function. Our data suggest a functional influence of DNA sequence at this critical epigenetic locus.

Relevance:

80.00%

Publisher:

Abstract:

Cellular stresses activate the tumor suppressor p53 protein leading to selective binding to DNA response elements (REs) and gene transactivation from a large pool of potential p53 REs (p53REs). To elucidate how p53RE sequences and local chromatin context interact to affect p53 binding and gene transactivation, we mapped genome-wide binding localizations of p53 and H3K4me3 in untreated and doxorubicin (DXR)-treated human lymphoblastoid cells. We examined the relationships among p53 occupancy, gene expression, H3K4me3, chromatin accessibility (DNase I hypersensitivity, DHS), ENCODE chromatin states, p53RE sequence, and evolutionary conservation. We observed that the inducible expression of p53-regulated genes was associated with the steady-state chromatin status of the cell. Most highly inducible p53-regulated genes were suppressed at baseline and marked by repressive histone modifications or displayed CTCF binding. Comparison of p53RE sequences residing in different chromatin contexts demonstrated that weaker p53REs resided in open promoters, while stronger p53REs were located within enhancers and repressed chromatin. p53 occupancy was strongly correlated with similarity of the target DNA sequences to the p53RE consensus, but surprisingly, inversely correlated with pre-existing nucleosome accessibility (DHS) and evolutionary conservation at the p53RE. Occupancy by p53 of REs that overlapped transposable element (TE) repeats was significantly higher (p < 10⁻⁷) and correlated with stronger p53RE sequences (p < 10⁻¹¹⁰) relative to nonTE-associated p53REs, particularly for MLT1H, LTR10B, and Mer61 TEs. However, binding at these elements was generally not associated with transactivation of adjacent genes. Occupied p53REs located in L2-like TEs were unique in displaying highly negative PhyloP scores (predicted fast-evolving) and being associated with altered H3K4me3 and DHS levels. These results underscore the systematic interaction between chromatin status and p53RE context in the induced transactivation response. This p53 regulated response appears to have been tuned via evolutionary processes that may have led to repression and/or utilization of p53REs originating from primate-specific transposon elements.

Relevance:

80.00%

Publisher:

Abstract:

Some patients with cancer never develop metastasis, and their host response might provide cues for innovative treatment strategies. We previously reported an association between autoantibodies against complement factor H (CFH) and early-stage lung cancer. CFH prevents complement-mediated cytotoxicity (CDC) by inhibiting formation of cell-lytic membrane attack complexes on self-surfaces. In an effort to translate these findings into a biologic therapy for cancer, we isolated and expressed DNA sequences encoding high-affinity human CFH antibodies directly from single, sorted B cells obtained from patients with the antibody. The co-crystal structure of a CFH antibody-target complex shows a conformational change in the target relative to the native structure. This recombinant CFH antibody causes complement activation and release of anaphylatoxins, promotes CDC of tumor cell lines, and inhibits tumor growth in vivo. The isolation of anti-tumor antibodies derived from single human B cells represents an alternative paradigm in antibody drug discovery.

Relevance:

80.00%

Publisher:

Abstract:

To provide biological insights into transcriptional regulation, a couple of groups have recently presented models relating the promoter DNA-bound transcription factors (TFs) to the downstream gene’s mean transcript level or transcript production rates over time. However, transcript production is dynamic in response to changes of TF concentrations over time. Also, TFs are not the only factors binding to promoters; other DNA binding factors (DBFs) bind as well, especially nucleosomes, resulting in competition between DBFs for binding at the same genomic location. Additionally, not only TFs but also some other elements regulate transcription. Within the core promoter, various regulatory elements influence RNAPII recruitment, PIC formation, RNAPII searching for the TSS, and RNAPII initiating transcription. Moreover, it is proposed that downstream from the TSS, nucleosomes resist RNAPII elongation.

Here, we provide a machine learning framework to predict transcript production rates from DNA sequences. We applied this framework in the yeast S. cerevisiae for two scenarios: a) to predict the dynamic transcript production rate during the cell cycle for native promoters; b) to predict the mean transcript production rate over time for synthetic promoters. As far as we know, our framework is the first successful attempt to build a model that can predict dynamic transcript production rates from DNA sequences only: with the cell cycle data set, we obtained a Pearson correlation coefficient Cp = 0.751 and a coefficient of determination r² = 0.564 on the test set for predicting dynamic transcript production rate over time. Also, for the DREAM6 Gene Promoter Expression Prediction challenge, our fitted model outperformed all participating teams, including the best team, as well as a model combining the best team’s k-mer based sequence features and another paper’s biologically mechanistic features, on all scoring metrics.

Moreover, our framework shows its capability of identifying generalizable features by interpreting the highly predictive models, and thereby provides support for associated hypothesized mechanisms about transcriptional regulation. With the learned sparse linear models, we obtained results supporting the following biological insights: a) TFs govern the probability of RNAPII recruitment and initiation, possibly through interactions with PIC components and transcription cofactors; b) the core promoter amplifies the transcript production, probably by influencing PIC formation, RNAPII recruitment, DNA melting, RNAPII searching for and selecting the TSS, releasing RNAPII from general transcription factors, and thereby initiation; c) there is strong transcriptional synergy between TFs and core promoter elements; d) the regulatory elements within the core promoter region extend beyond the TATA box and the nucleosome-free region, suggesting the existence of still unidentified TAF-dependent and cofactor-dependent core promoter elements in yeast S. cerevisiae; e) nucleosome occupancy is helpful for representing the regulatory roles of the +1 and -1 nucleosomes in transcription.

Relevance:

30.00%

Publisher:

Abstract:

This paper reports a new strategy, recursive directional ligation by plasmid reconstruction (PRe-RDL), to rapidly clone highly repetitive polypeptides of any sequence and specified length over a large range of molecular weights. In a single cycle of PRe-RDL, two halves of a parent plasmid, each containing a copy of an oligomer, are ligated together, thereby dimerizing the oligomer and reconstituting a functional plasmid. This process is carried out recursively to assemble an oligomeric gene with the desired number of repeats. PRe-RDL has several unique features that stem from the use of type IIs restriction endonucleases: first, PRe-RDL is a seamless cloning method that leaves no extraneous nucleotides at the ligation junction. Because it uses type IIs endonucleases to ligate the two halves of the plasmid, PRe-RDL also addresses the major limitation of RDL in that it abolishes any restriction on the gene sequence that can be oligomerized. The reconstitution of a functional plasmid only upon successful ligation in PRe-RDL also addresses two other limitations of RDL: the significant background from self-ligation of the vector observed in RDL, and the decreased efficiency of ligation due to nonproductive circularization of the insert. PRe-RDL can also be used to assemble genes that encode different sequences in a predetermined order to encode block copolymers or append leader and trailer peptide sequences to the oligomerized gene.
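
The recursive dimerization described in this abstract can be sketched as a toy model. The code below assumes that each cycle simply joins two copies of the current oligomer with no extra nucleotides at the junction, so the repeat count doubles per cycle. It does not model plasmids, restriction sites, or ligation chemistry, and the function names and the monomer sequence in the usage example are hypothetical placeholders.

```python
def dimerize(oligomer_gene: str) -> str:
    """One idealized cycle: join two copies seamlessly (no junction bases)."""
    return oligomer_gene + oligomer_gene


def assemble(monomer: str, cycles: int) -> str:
    """Apply the dimerization step recursively for the given number of cycles."""
    gene = monomer
    for _ in range(cycles):
        gene = dimerize(gene)
    return gene


def cycles_needed(target_repeats: int) -> int:
    """Smallest number of doubling cycles giving at least target_repeats copies."""
    if target_repeats < 1:
        raise ValueError("target_repeats must be at least 1")
    cycles, repeats = 0, 1
    while repeats < target_repeats:
        repeats *= 2
        cycles += 1
    return cycles


if __name__ == "__main__":
    monomer = "GGTGTTCCG"  # placeholder sequence, not taken from the paper
    gene = assemble(monomer, cycles_needed(8))
    print(len(gene) // len(monomer))  # number of repeats after assembly
```

Under this doubling assumption, repeat counts reachable from a single monomer are powers of two. How the method reaches other specified lengths is not stated in the abstract and is not modeled here.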

Relevance:

30.00%

Publisher:

Abstract:

The short arms of the ten acrocentric human chromosomes share several repetitive DNAs, including ribosomal RNA genes (rDNA). The rDNA arrays correspond to nucleolar organizing regions that coalesce each cell cycle to form the nucleolus. Telomere disruption by expressing a mutant version of telomere binding protein TRF2 (dnTRF2) causes non-random acrocentric fusions, as well as large-scale nucleolar defects. The mechanisms responsible for acrocentric chromosome sensitivity to dysfunctional telomeres are unclear. In this study, we show that TRF2 normally associates with the nucleolus and rDNA. However, when telomeres are crippled by dnTRF2 or RNAi knockdown of TRF2, gross nucleolar and chromosomal changes occur. We used the controllable dnTRF2 system to precisely dissect the timing and progression of nucleolar and chromosomal instability induced by telomere dysfunction, demonstrating that nucleolar changes precede the DNA damage and morphological changes that occur at acrocentric short arms. The rDNA repeat arrays on the short arms decondense, and are coated by RNA polymerase I transcription binding factor UBF, physically linking acrocentrics to one another as they become fusogenic. These results highlight the importance of telomere function in nucleolar stability and structural integrity of acrocentric chromosomes, particularly the rDNA arrays. Telomeric stress is widely accepted to cause DNA damage at chromosome ends, but our findings suggest that it also disrupts chromosome structure beyond the telomere region, specifically within the rDNA arrays located on acrocentric chromosomes. These results have relevance for Robertsonian translocation formation in humans and mechanisms by which acrocentric-acrocentric fusions are promoted by DNA damage and repair.

Relevance:

30.00%

Publisher:

Abstract:

Ferns are one of the few remaining major clades of land plants for which a complete genome sequence is lacking. Knowledge of genome space in ferns will enable broad-scale comparative analyses of land plant genes and genomes, provide insights into genome evolution across green plants, and shed light on genetic and genomic features that characterize ferns, such as their high chromosome numbers and large genome sizes. As part of an initial exploration into fern genome space, we used a whole genome shotgun sequencing approach to obtain low-density coverage (∼0.4X to 2X) for six fern species from the Polypodiales (Ceratopteris, Pteridium, Polypodium, Cystopteris), Cyatheales (Plagiogyria), and Gleicheniales (Dipteris). We explore these data to characterize the proportion of the nuclear genome represented by repetitive sequences (including DNA transposons, retrotransposons, ribosomal DNA, and simple repeats) and protein-coding genes, and to extract chloroplast and mitochondrial genome sequences. Such initial sweeps of fern genomes can provide information useful for selecting a promising candidate fern species for whole genome sequencing. We also describe variation of genomic traits across our sample and highlight some differences and similarities in repeat structure between ferns and seed plants.